contextual dynamic pricing
Improved Algorithms for Contextual Dynamic Pricing
In contextual dynamic pricing, a seller sequentially prices goods based on contextual information. Buyers will purchase products only if the prices are below their valuations.The goal of the seller is to design a pricing strategy that collects as much revenue as possible. We focus on two different valuation models. The first assumes that valuations linearly depend on the context and are further distorted by noise. Under minor regularity assumptions, our algorithm achieves an optimal regret bound of \tilde{\mathcal{O}}(T {2/3}), improving the existing results.
Transfer Learning for Nonparametric Contextual Dynamic Pricing
Wang, Fan, Jiang, Feiyu, Zhao, Zifeng, Yu, Yi
Dynamic pricing strategies are crucial for firms to maximize revenue by adjusting prices based on market conditions and customer characteristics. However, designing optimal pricing strategies becomes challenging when historical data are limited, as is often the case when launching new products or entering new markets. One promising approach to overcome this limitation is to leverage information from related products or markets to inform the focal pricing decisions. In this paper, we explore transfer learning for nonparametric contextual dynamic pricing under a covariate shift model, where the marginal distributions of covariates differ between source and target domains while the reward functions remain the same. We propose a novel Transfer Learning for Dynamic Pricing (TLDP) algorithm that can effectively leverage pre-collected data from a source domain to enhance pricing decisions in the target domain. The regret upper bound of TLDP is established under a simple Lipschitz condition on the reward function. To establish the optimality of TLDP, we further derive a matching minimax lower bound, which includes the target-only scenario as a special case and is presented for the first time in the literature. Extensive numerical experiments validate our approach, demonstrating its superiority over existing methods and highlighting its practical utility in real-world applications.
- North America > United States (0.28)
- Europe > Austria > Vienna (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Fairness-aware Contextual Dynamic Pricing with Strategic Buyers
Contextual pricing strategies are prevalent in online retailing, where the seller adjusts prices based on products' attributes and buyers' characteristics. Although such strategies can enhance seller's profits, they raise concerns about fairness when significant price disparities emerge among specific groups, such as gender or race. These disparities can lead to adverse perceptions of fairness among buyers and may even violate the law and regulation. In contrast, price differences can incentivize disadvantaged buyers to strategically manipulate their group identity to obtain a lower price. In this paper, we investigate contextual dynamic pricing with fairness constraints, taking into account buyers' strategic behaviors when their group status is private and unobservable from the seller. We propose a dynamic pricing policy that simultaneously achieves price fairness and discourages strategic behaviors. Our policy achieves an upper bound of $O(\sqrt{T}+H(T))$ regret over $T$ time horizons, where the term $H(T)$ arises from buyers' assessment of the fairness of the pricing policy based on their learned price difference. When buyers are able to learn the fairness of the price policy, this upper bound reduces to $O(\sqrt{T})$. We also prove an $\Omega(\sqrt{T})$ regret lower bound of any pricing policy under our problem setting. We support our findings with extensive experimental evidence, showcasing our policy's effectiveness. In our real data analysis, we observe the existence of price discrimination against race in the loan application even after accounting for other contextual information. Our proposed pricing policy demonstrates a significant improvement, achieving 35.06% reduction in regret compared to the benchmark policy.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Indiana > Marion County > Indianapolis (0.04)
- (3 more...)
- Law (0.92)
- Banking & Finance > Loans (0.66)
Contextual Dynamic Pricing with Unknown Noise: Explore-then-UCB Strategy and Improved Regrets
Dynamic pricing is a fast-moving research area in machine learning and operations management. A lot of work has been done for this problem with known noise. In this paper, we consider a contextual dynamic pricing problem under a linear customer valuation model with an unknown market noise distribution F . This problem is very challenging due to the difficulty in balancing three tangled tasks of revenue-maximization, estimating the linear valuation parameter \theta_{0}, and learning the nonparametric F . To address this issue, we develop a novel {\it Explore-then-UCB} (ExUCB) strategy that includes an exploration for \theta_{0} -learning and a followed UCB procedure of joint revenue-maximization and F -learning.
Contextual Dynamic Pricing: Algorithms, Optimality, and Local Differential Privacy Constraints
Zhao, Zifeng, Jiang, Feiyu, Yu, Yi
We study the contextual dynamic pricing problem where a firm sells products to $T$ sequentially arriving consumers that behave according to an unknown demand model. The firm aims to maximize its revenue, i.e. minimize its regret over a clairvoyant that knows the model in advance. The demand model is a generalized linear model (GLM), allowing for a stochastic feature vector in $\mathbb R^d$ that encodes product and consumer information. We first show that the optimal regret upper bound is of order $\sqrt{dT}$, up to a logarithmic factor, improving upon existing upper bounds in the literature by a $\sqrt{d}$ factor. This sharper rate is materialised by two algorithms: a confidence bound-type (supCB) algorithm and an explore-then-commit (ETC) algorithm. A key insight of our theoretical result is an intrinsic connection between dynamic pricing and the contextual multi-armed bandit problem with many arms based on a careful discretization. We further study contextual dynamic pricing under the local differential privacy (LDP) constraints. In particular, we propose a stochastic gradient descent based ETC algorithm that achieves an optimal regret upper bound of order $d\sqrt{T}/\epsilon$, up to a logarithmic factor, where $\epsilon>0$ is the privacy parameter. The regret upper bounds with and without LDP constraints are accompanied by newly constructed minimax lower bounds, which further characterize the cost of privacy. Extensive numerical experiments and a real data application on online lending are conducted to illustrate the efficiency and practical value of the proposed algorithms in dynamic pricing.
- Asia > Middle East > Jordan (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Security & Privacy (1.00)
- Banking & Finance (0.67)